Apache Spark Accelerated Deep Learning Inference for Large Scale Satellite Image Analytics
Similar resources
Approximate Stream Analytics in Apache Flink and Apache Spark Streaming
Approximate computing aims for efficient execution of workflows where an approximate output is sufficient instead of the exact output. The idea behind approximate computing is to compute over a representative sample instead of the entire input dataset. Thus, approximate computing — based on the chosen sample size — can make a systematic trade-off between the output accuracy and computation effi...
GPU-Accelerated Large Scale Analytics
In this paper, we report our research on using GPUs as accelerators for Business Intelligence (BI) analytics. We are particularly interested in analytics on very large data sets, which are common in today's real-world BI applications. While many published works have shown that GPUs can be used to accelerate various general-purpose applications with respectable performance gains, few attempts hav...
Towards Large Scale Environmental Data Processing with Apache Spark
Currently available environmental datasets are either manually constructed by professionals or automatically generated from the observations provided by sensing devices. Usually, the former are modelled and recorded with traditional general-purpose relational technologies, whereas the latter require more specific scientific array formats and tools. Declarative data processing technologies are a...
MLlib: Machine Learning in Apache Spark
Apache Spark is a popular open-source platform for large-scale data processing that is well-suited for iterative machine learning tasks. In this paper we present MLlib, Spark’s open-source distributed machine learning library. MLlib provides efficient functionality for a wide range of learning settings and includes several underlying statistical, optimization, and linear algebra primitives. Shi...
Benchmarking Apache Spark with Machine Learning Applications
We benchmarked Apache Spark with a popular parallel machine learning training application, Distributed Stochastic Gradient Descent for Matrix Factorization [5], and compared the Spark implementation with alternative approaches for communicating model parameters, such as scheduled pipelining using POSIX sockets or MPI, and distributed shared memory (e.g. parameter server [13]). We found that Spark...
Journal
Journal title: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Year: 2020
ISSN: 1939-1404,2151-1535
DOI: 10.1109/jstars.2019.2959707